Abstract



The evolution of a continuous-time Markov process with a finite number of states is usually
calculated by the Master equation - a linear differential equation with a singular generator
matrix. We derive a general method for reducing the dimensionality of the Master equation
by one by using the probability normalization constraint, thus obtaining an affine differential
equation with a (non-singular) stable generator matrix. Additionally, the reduced form yields
a simple explicit expression for the stationary probability distribution, which is usually derived
implicitly. Finally, we discuss the application of this method to stochastic differential equations.

1 Introduction 

Let $X(t)$ be a continuous-time Markov process with discrete states $\{1, 2, \ldots, M\}$, where $1 < M < \infty$, with $A_{ij}$ being the (non-negative) transition rate from state $j$ to state $i$. We define $p_i(t) \in [0, 1]$ to be the probability of being in state $i$ at time $t$, the probability vector

$$p(t) = (p_1(t), \ldots, p_M(t))^T \in [0, 1]^M \tag{1.1}$$



and the rate matrix $A$, so that

$$A_{jj} = -\sum_{i \neq j} A_{ij} \tag{1.2}$$



and

$$\frac{dp(t)}{dt} = A p(t) \tag{1.3}$$



is the corresponding master equation, with solution 



$$p(t) = \exp(At)\, p(0) . \tag{1.4}$$
From the normalization of the probability, $p(t)$ must be constrained at all times by

$$e^T p(t) = 1 \; ; \quad e \triangleq (1, 1, \ldots, 1)^T \tag{1.5}$$



Note that from the properties of $A$ (specifically, the fact that $e^T A = 0$), if we start from an initial condition $p(0) \in [0, 1]^M$ so that $e^T p(0) = 1$, then, $\forall t$, $e^T p(t) = 1$ automatically - though this is not immediately obvious from the above notation.
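The conservation of normalization can be illustrated with a small sketch. The 3-state rate matrix below is a hypothetical example (not taken from the text), and forward Euler is used only as a crude illustrative integrator; exact rational arithmetic makes the conserved sum visible without rounding noise.

```python
from fractions import Fraction

# Hypothetical 3-state rate matrix: A[i][j] is the rate from state j to state i,
# and each diagonal entry is chosen so that its column sums to zero.
A = [[Fraction(-3), Fraction(1), Fraction(2)],
     [Fraction(1), Fraction(-4), Fraction(2)],
     [Fraction(2), Fraction(3), Fraction(-4)]]
M = len(A)


def matvec(mat, vec):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(a * v for a, v in zip(row, vec)) for row in mat]


# e^T A = 0 means that every column of A sums to zero.
column_sums = [sum(A[i][j] for i in range(M)) for j in range(M)]

# Forward-Euler steps of dp/dt = A p, starting from a normalized p(0).
p = [Fraction(1), Fraction(0), Fraction(0)]
dt = Fraction(1, 100)
totals = []
for _ in range(50):
    dp = matvec(A, p)
    p = [pi + dt * dpi for pi, dpi in zip(p, dp)]
    totals.append(sum(p))  # e^T p after each step
```

Every entry of `totals` stays equal to one, because each step adds `dt * e^T A p = 0` to the sum.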

In order to improve the interpretability of the above notation, we combine Eq. 1.5 directly with Eq. 1.3. We shall henceforth assume that $X(t)$ is irreducible, and reduce the dimensionality of the problem from $M$ to $M - 1$ (section 2). Note that if instead $X(t)$ is reducible with $K$ connected components, then the method suggested here can be applied to each component separately, reducing the dimensionality of the problem from $M$ to $M - K$ (see appendix A). The reduced form of the master equation (Eq. 2.3 or Eq. 4.3) has some "nice" properties. For example, in section 3 we prove that the reduced form is strictly contracting; in section 4 we show it is easy to find a novel explicit form for the stationary (invariant) distribution using this reduced form (for the relation with previous stationary distribution expressions see appendix B); and in section 5 we discuss the application of this method to stochastic differential equations (SDE) based on a population of independent Markov processes.

Note that similar reduction methods are rather popular for the special case of a two-state system $x \rightleftharpoons 1 - x$, in the context of deterministic kinetic equations, which are the limit of the SDEs for an infinite population (e.g. [3]). In a few special cases they were also used in SDE descriptions of specific systems with more than one state [2].



2 Reduction of the Master Equation 

First, we make a few additional definitions: 
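1. $I_M$ is the $M \times M$ identity matrix.

2. $J$ is $I_M$ with its last row removed, with $\dim(J) = (M - 1) \times M$:

$$J = \begin{pmatrix} 1 & & & 0 \\ & \ddots & & \vdots \\ & & 1 & 0 \end{pmatrix}$$

3. $e_M = (0, 0, \ldots, 1)^T$

4. $H \triangleq (I_M - e_M e^T) J^T$, with $\dim(H) = M \times (M - 1)$:

$$H = \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \\ -1 & \cdots & -1 \end{pmatrix}$$

5. $\tilde{p}(t) = J p(t)$, with $\dim(\tilde{p}) = (M - 1) \times 1$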
Note that $\tilde{p}(t) \in [0, 1]^{M-1}$, and the "hard" normalization constraint has been lifted (instead we remain with a "soft" constraint $e^T J^T \tilde{p}(t) \le 1$). Using these definitions, we can use Eq. 1.5 to write

$$p(t) = e_M + H \tilde{p}(t) \tag{2.1}$$

Substituting this into Eq. 1.3 we obtain

$$H \frac{d\tilde{p}(t)}{dt} = A e_M + A H \tilde{p}(t)$$

Multiplying this by $J$ from the left, we obtain

$$J H \frac{d\tilde{p}(t)}{dt} = J A e_M + J A H \tilde{p}(t) .$$

Using the fact that

$$JH = J (I_M - e_M e^T) J^T = J J^T = I_{M-1} , \tag{2.2}$$

where we used $J e_M = 0$ in the second equality, and defining $\tilde{A} = JAH$, $\tilde{b} = JAe_M$, we can write our first reduced form of Eq. 1.3

$$\frac{d\tilde{p}(t)}{dt} = \tilde{b} + \tilde{A} \tilde{p}(t) . \tag{2.3}$$
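As a minimal sketch of the construction above, the following builds $J$ and $H$ for a hypothetical 3-state rate matrix (an illustrative example, not from the text), forms the reduced generator $JAH$ and offset $JAe_M$, and checks Eq. 2.2 together with the consistency of the reduced dynamics. The names `reduction_matrices`, `A_red` and `b_red` are illustrative choices.

```python
from fractions import Fraction


def matmul(X, Y):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def reduction_matrices(M):
    """Return J (I_M without its last row) and H = (I_M - e_M e^T) J^T."""
    J = [[Fraction(int(i == j)) for j in range(M)] for i in range(M - 1)]
    H = [[Fraction(int(i == j)) for j in range(M - 1)] for i in range(M - 1)]
    H.append([Fraction(-1)] * (M - 1))  # the last row of H is all -1
    return J, H


# Hypothetical rate matrix with zero column sums (A[i][j]: rate from j to i).
A = [[Fraction(-3), Fraction(1), Fraction(2)],
     [Fraction(1), Fraction(-4), Fraction(2)],
     [Fraction(2), Fraction(3), Fraction(-4)]]
M = len(A)
J, H = reduction_matrices(M)
e_M = [[Fraction(0)] for _ in range(M - 1)] + [[Fraction(1)]]  # column vector

JH = matmul(J, H)                  # should equal I_{M-1} (Eq. 2.2)
A_red = matmul(matmul(J, A), H)    # reduced generator JAH
b_red = matmul(matmul(J, A), e_M)  # reduced offset J A e_M (column vector)

# Consistency check on an arbitrary normalized p: J (A p) = b_red + A_red (J p).
p = [[Fraction(1, 2)], [Fraction(1, 3)], [Fraction(1, 6)]]
lhs = matmul(J, matmul(A, p))
p_red = matmul(J, p)
rhs = [[b[0] + r[0]] for b, r in zip(b_red, matmul(A_red, p_red))]
```

For this example the reduced generator is the 2 x 2 matrix with rows (-5, -1) and (-1, -6), and `lhs` equals `rhs`, as Eq. 2.3 requires.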

3 Properties of $\tilde{A}$

Since $A$ is a rate matrix of an irreducible process, it has a single zero eigenvalue and all the other eigenvalues have negative real parts [6]. Given this, we can find the eigenvalues of $\tilde{A}$.

Theorem 1. Assume $X(t)$ is an irreducible process; then $\tilde{A}$ has the same eigenvalues as $A$, except its (unique) zero eigenvalue.

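Proof. To find the eigenvalues of $\tilde{A}$, we examine its characteristic polynomial for $\lambda \neq 0$:

$$\begin{aligned}
|\tilde{A} - \lambda I_{M-1}| &= |JAH - \lambda I_{M-1}| \\
&\overset{(1)}{=} (-\lambda)^{M-1} \left| I_{M-1} - \lambda^{-1} J A (I_M - e_M e^T) J^T \right| \\
&\overset{(2)}{=} (-\lambda)^{M-1} \left| I_M - \lambda^{-1} (I_M - e_M e^T) J^T J A \right| \\
&\overset{(3)}{=} -\lambda^{-1} \left| (I_M - e_M e^T)(I_M - e_M e_M^T) A - \lambda I_M \right| \\
&\overset{(4)}{=} -\lambda^{-1} \left| A - \lambda I_M \right| \\
&\overset{(5)}{=} -\lambda^{-1} \prod_{m=1}^{M} (\lambda_m - \lambda) = \prod_{m=2}^{M} (\lambda_m - \lambda) ,
\end{aligned}$$

where in (1) we used the definition of $H$ and the fact that $|cX| = c^n |X|$ for any $n \times n$ matrix $X$ and scalar $c$, in (2) we used Sylvester's determinant theorem ($|I_p + BC| = |I_n + CB|$ for all matrices $B$, $C$ of size $p \times n$ and $n \times p$ respectively), in (3) we used $J^T J = I_M - e_M e_M^T$ and $|cX| = c^n |X|$ again, in (4) we used $e^T e_M = 1$ and $e^T A = 0$, and in (5) we denoted by $\lambda_m$ the eigenvalues of $A$, with $\lambda_1 = 0$. Both sides are polynomials in $\lambda$, so the identity extends to $\lambda = 0$, and the last line concludes the proof. □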

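Remark. The eigenvectors of $A$ and $\tilde{A}$ need some care. For a nonzero eigenvalue $\lambda_m$, a right eigenvector $v_m$ of $A$ satisfies $e^T v_m = 0$ (since $e^T A = 0$), so that $v_m = H J v_m$ and $\tilde{v}_m = J v_m$ is a right eigenvector of $\tilde{A}$. The left eigenvectors, however, are not tied by this simple projection: if $u_m^T A = \lambda_m u_m^T$ then the corresponding left eigenvector of $\tilde{A}$ is $\tilde{u}_m = H^T u_m$, and in general $\tilde{u}_m \neq J u_m$.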

Recall again that a rate matrix $A$ of an irreducible process has a single zero eigenvalue and all the other eigenvalues have negative real parts [5]. Using Theorem 1, this immediately gives

Corollary 2. $\tilde{A}$ is a stable matrix - i.e. all its eigenvalues have a strictly negative real part.

Specifically, since $\tilde{A}$ does not have any zero eigenvalues,

Corollary 3. $\tilde{A}$ is a non-singular matrix, and therefore invertible.
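Theorem 1 is equivalent to the polynomial identity $|A - \lambda I_M| = -\lambda\, |\tilde{A} - \lambda I_{M-1}|$, which can be spot-checked exactly. The sketch below does so for a hypothetical 3-state rate matrix at a few arbitrary rational values of $\lambda$; the helper names `det` and `shifted` are illustrative, and the check is a sanity test rather than a proof.

```python
from fractions import Fraction


def det(mat):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    m = [row[:] for row in mat]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: singular matrix
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result    # a row swap flips the sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


def shifted(mat, lam):
    """Return mat - lam * I."""
    n = len(mat)
    return [[mat[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]


# Hypothetical rate matrix with zero column sums (A[i][j]: rate from j to i).
A = [[Fraction(-3), Fraction(1), Fraction(2)],
     [Fraction(1), Fraction(-4), Fraction(2)],
     [Fraction(2), Fraction(3), Fraction(-4)]]
M = len(A)

# JAH written entrywise: (JAH)_{ij} = A_{ij} - A_{iM} for i, j < M.
A_red = [[A[i][j] - A[i][M - 1] for j in range(M - 1)] for i in range(M - 1)]

test_points = [Fraction(1), Fraction(-2), Fraction(3, 7)]
identity_holds = all(
    det(shifted(A, lam)) == -lam * det(shifted(A_red, lam))
    for lam in test_points
)
det_A = det(A)          # zero, because A is singular
det_A_red = det(A_red)  # nonzero, in line with Corollary 3
```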



4 Stationary Distribution 

Recall [6] that if $X(t)$ is irreducible then $p(t) \to p_\infty$, a stationary distribution which is the (unique) zero eigenvector of the matrix $A$,

$$0 = A p_\infty . \tag{4.1}$$

This is an implicit equation for $p_\infty$. However, using our reduced version, it is easy to find an explicit expression for the stationary distribution.
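Using Eq. 2.3 and Corollary 3 we define

$$\tilde{p}_\infty \triangleq -\tilde{A}^{-1} \tilde{b} \tag{4.2}$$

and rewrite Eq. 2.3 as

$$\frac{d\tilde{p}(t)}{dt} = \tilde{A} \left( \tilde{p}(t) - \tilde{p}_\infty \right) , \tag{4.3}$$

which is our second reduced form of Eq. 1.3.

The solution of Eq. 4.3 is

$$\tilde{p}(t) = \tilde{p}_\infty + e^{\tilde{A} t} \left( \tilde{p}(0) - \tilde{p}_\infty \right) ,$$

and since $\tilde{A}$ is stable, $\tilde{p}(t) \to \tilde{p}_\infty$.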

And so, we found an explicit expression for the steady-state distribution in the reduced form

$$\tilde{p}_\infty = -(JAH)^{-1} J A e_M .$$

Returning to the original form, using Eq. 2.1 we obtain the explicit expression

$$p_\infty = \left( I_M - H (JAH)^{-1} J A \right) e_M . \tag{4.4}$$



In appendix B we compare this expression with previous results. Note that for a discrete-time Markov chain with transition matrix $P$, we can again find the stationary distribution by substituting $A = I - P$ in either the reduced or the original explicit expression above (Eq. 4.4).
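The explicit expression can be sketched in a few lines: solve the reduced linear system $\tilde{A} \tilde{p}_\infty = -\tilde{b}$ and append the last component from normalization, which is what Eq. 4.4 amounts to. The rate matrix and the transition matrix `P` below are hypothetical examples; `P` is assumed to be column-stochastic, matching the convention that $A_{ij}$ is the rate from $j$ to $i$. The helper names `solve` and `stationary` are illustrative.

```python
from fractions import Fraction


def solve(mat, rhs):
    """Solve mat * x = rhs by Gauss-Jordan elimination (exact arithmetic)."""
    n = len(mat)
    aug = [row[:] + [b] for row, b in zip(mat, rhs)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [aug[r][n] for r in range(n)]


def stationary(A):
    """Stationary distribution via the reduced system (JAH) p_red = -J A e_M."""
    M = len(A)
    # (JAH)_{ij} = A_{ij} - A_{iM} and (J A e_M)_i = A_{iM} for i, j < M.
    A_red = [[A[i][j] - A[i][M - 1] for j in range(M - 1)]
             for i in range(M - 1)]
    b_red = [A[i][M - 1] for i in range(M - 1)]
    p_red = solve(A_red, [-b for b in b_red])
    return p_red + [1 - sum(p_red)]  # p = e_M + H p_red


# Continuous-time example: hypothetical rate matrix with zero column sums.
A = [[Fraction(-3), Fraction(1), Fraction(2)],
     [Fraction(1), Fraction(-4), Fraction(2)],
     [Fraction(2), Fraction(3), Fraction(-4)]]
p_inf = stationary(A)
residual = [sum(a * p for a, p in zip(row, p_inf)) for row in A]  # A p_inf

# Discrete-time example: hypothetical column-stochastic P, with A = I - P.
P = [[Fraction(1, 2), Fraction(1, 4)],
     [Fraction(1, 2), Fraction(3, 4)]]
I_minus_P = [[(1 if i == j else 0) - P[i][j] for j in range(2)]
             for i in range(2)]
p_discrete = stationary(I_minus_P)
```

Here `residual` vanishes, confirming Eq. 4.1 for the computed `p_inf`, and `p_discrete` satisfies `P p = p`.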
5 The reduction methods in stochastic differential equations



Consider a population of $N$ identical, irreducible and independent Markov processes $\{X_n(t)\}_{n=1}^{N}$, where each process has states $\{1, 2, \ldots, M\}$, where $1 < M < \infty$. Also, for all processes, $A_{ij}$ is the transition rate from state $j$ to state $i$, and $A$ is the corresponding matrix. We denote by $x_i(t)$ the fraction of processes that are in state $i$ at time $t$ (not following the convention of using upper case only for random variables). Formally

$$x_i(t) = \frac{1}{N} \sum_{n=1}^{N} \mathbb{I}\left[ X_n(t) = i \right]$$

where $\mathbb{I}[\cdot]$ is the indicator function. Also, we denote $x = (x_1, \ldots, x_M)^T$. From normalization,

$$e^T x(t) = 1 \; ; \quad e \triangleq (1, 1, \ldots, 1)^T . \tag{5.1}$$



As derived in [3], for large enough $N$ we can approximate the dynamics of $x$ by the following $M$-dimensional stochastic differential equation (SDE)

$$\frac{dx(t)}{dt} = A x(t) + B(x(t))\, \xi(t) \tag{5.2}$$

where $\xi$ is a vector of $M(M-1)/2$ independent white noise processes with zero mean and correlation $\langle \xi(t) \xi^T(t') \rangle = I\, \delta(t - t')$ ($\langle \cdot \rangle$ denotes ensemble expectation), and $B$ is a (sparse) $M \times M(M-1)/2$ matrix, with

$$B_{ik} = \pm \frac{1}{\sqrt{N}} \sqrt{A_{ij} x_j + A_{ji} x_i}$$

where $k$ is the index of a transition pair $(i \rightleftharpoons j)$ and $j$ is the index of the state connected to state $i$ by transition pair $k$; the two nonzero entries in column $k$ carry opposite signs, so that Eq. 5.1 is preserved. Note that since $N$ is large, any Ito correction would be of size $O(N^{-2})$, and is therefore neglected here.

We can reduce the form of Eq. 5.2 using Eq. 5.1 in a similar way as we did for the Markov process. Defining $\tilde{A} = JAH$ (as before), $\tilde{B} = JB$ (with $x_M$ replaced by $1 - x_1 - x_2 - \cdots - x_{M-1}$) and $\tilde{x}_\infty = \tilde{p}_\infty = -\tilde{A}^{-1} J A e_M$, we obtain the following equation for the reduced state vector $\tilde{x} = Jx$

$$\frac{d\tilde{x}(t)}{dt} = \tilde{A} \left( \tilde{x}(t) - \tilde{x}_\infty \right) + \tilde{B}(\tilde{x}(t))\, \xi(t) . \tag{5.3}$$

As before, $\tilde{A}$ is a stable matrix. Additionally, the reduced diffusion matrix $\tilde{D} = \tilde{B} \tilde{B}^T$ is positive definite (in contrast to $D = B B^T$, which is only semi-definite). This stems from the combination of the following facts: (1) $\tilde{D} = \tilde{B} \tilde{B}^T$ is symmetric; (2) the rank of $\tilde{B}$ is $M - 1$ (for irreducible $X_n(t)$); (3) for any real matrix $X$, $\mathrm{rank}(X X^T) = \mathrm{rank}(X)$ [1].
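The contrast between the full and reduced diffusion matrices can be illustrated exactly, without sampling any noise. The sketch below assumes the usual diffusion-approximation form of the noise matrix, in which the column for transition pair $(i, j)$ equals $\sqrt{(A_{ij} x_j + A_{ji} x_i)/N}\,(e_i - e_j)$ with $e_i$ the $i$-th unit vector; this form, the rate matrix, the population size `N` and the evaluation point `x` are all assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical rate matrix with zero column sums (A[i][j]: rate from j to i).
A = [[Fraction(-3), Fraction(1), Fraction(2)],
     [Fraction(1), Fraction(-4), Fraction(2)],
     [Fraction(2), Fraction(3), Fraction(-4)]]
M = len(A)
N = 100  # hypothetical population size
x = [Fraction(10, 29), Fraction(8, 29), Fraction(11, 29)]  # sums to one

# D = B B^T assembled pair by pair under the ASSUMED column form of B:
# each pair (i, j) contributes c * (e_i - e_j)(e_i - e_j)^T, which avoids
# taking square roots and keeps the arithmetic exact.
D = [[Fraction(0)] * M for _ in range(M)]
for i in range(M):
    for j in range(i + 1, M):
        c = (A[i][j] * x[j] + A[j][i] * x[i]) / N
        D[i][i] += c
        D[j][j] += c
        D[i][j] -= c
        D[j][i] -= c

# Every row of D sums to zero, so D e = 0 and D is only semi-definite.
row_sums = [sum(row) for row in D]

# Reduced diffusion matrix J D J^T: drop the last row and column of D.
D_red = [row[:M - 1] for row in D[:M - 1]]

# Sylvester's criterion for the symmetric 2x2 reduced matrix:
# it is positive definite when both leading principal minors are positive.
minor_1 = D_red[0][0]
minor_2 = D_red[0][0] * D_red[1][1] - D_red[0][1] * D_red[1][0]
```

Under these assumptions `row_sums` vanishes while both minors of `D_red` are strictly positive, in line with the rank argument above.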






Appendix 



A Generalization to reducible processes



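Assume now that $X(t)$ is a reducible process, with $K$ connected components $C_k$, $k \in \{1, 2, \ldots, K\}$, where $C_k$ contains $M^{(k)}$ states. In this case, we can write

$$A = \begin{pmatrix} A^{(1)} & 0 & \cdots & 0 \\ 0 & A^{(2)} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & A^{(K)} \end{pmatrix}$$

Also, the normalization condition (Eq. 1.5) can be expanded to each component separately,

$$\forall k : \; u_k^T p(t) = q_k \; ; \quad (u_k)_m \triangleq \mathbb{I}\left[ m \in C_k \right]$$

where $\sum_k q_k = 1$. In order to derive the reduced form of Eq. 1.3 in this case, we just have to find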
the reduced form for each component separately, and then concatenate the equations, reducing the 
dimensionality from M to M — K. Formally, we define: 

1. $a_k$ is the index of the last ($M^{(k)}$-th) state in $C_k$.

2. $L$ is $I_M$ with the rows corresponding to $\{a_k\}_{k=1}^{K}$ removed.

3. $f$ is a length-$M$ vector with $f_{a_k} = q_k$ for $k = 1, \ldots, K$ and all other entries equal to 0.

4. $H_m$ is defined as $H$ with $M = m$.



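5. $G$ is the block-diagonal matrix

$$G = \begin{pmatrix} H_{M^{(1)}} & & & \\ & H_{M^{(2)}} & & \\ & & \ddots & \\ & & & H_{M^{(K)}} \end{pmatrix}$$

6. $\tilde{p}(t) = L p(t)$

Using these definitions, we can use Eq. 1.5 to write

$$p(t) = f + G \tilde{p}(t) \tag{A.1}$$

Substituting this into Eq. 1.3 we obtain

$$G \frac{d\tilde{p}(t)}{dt} = A f + A G \tilde{p}(t) .$$

Multiplying this by $L$ from the left, we obtain

$$L G \frac{d\tilde{p}(t)}{dt} = L A f + L A G \tilde{p}(t)$$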
Using the fact that

$$LG = I_{M-K} , \tag{A.2}$$

and defining $\tilde{A} = LAG$, $\tilde{b} = LAf$, we can write our first reduced form of Eq. 1.3

$$\frac{d\tilde{p}(t)}{dt} = \tilde{b} + \tilde{A} \tilde{p}(t) , \tag{A.3}$$



which has dimension $M - K$. All the other results we derived for the irreducible case (i.e. the properties of $\tilde{A}$, the stationary distribution, etc.) can be similarly proven.
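The construction for the reducible case can be sketched as follows, assuming the states are ordered so that each component occupies a contiguous block (as in the block-diagonal form of $A$). The 4-state, two-component rate matrix and the component masses $q_k$ are hypothetical, and `build_reduction` is an illustrative helper name.

```python
from fractions import Fraction


def matmul(X, Y):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def build_reduction(sizes, masses):
    """Return L, f (as a column) and G for contiguous components."""
    M, K = sum(sizes), len(sizes)
    last, start = [], 0
    for m in sizes:                 # a_k: index of the last state of C_k
        last.append(start + m - 1)
        start += m
    L = [[Fraction(int(r == c)) for c in range(M)]
         for r in range(M) if r not in last]
    f = [[Fraction(0)] for _ in range(M)]
    for a, q in zip(last, masses):
        f[a][0] = q
    G = [[Fraction(0)] * (M - K) for _ in range(M)]
    row = col = 0
    for m in sizes:                 # place one block H_m per component
        for i in range(m - 1):
            G[row + i][col + i] = Fraction(1)
            G[row + m - 1][col + i] = Fraction(-1)
        row += m
        col += m - 1
    return L, f, G


# Hypothetical block-diagonal rate matrix with components {1, 2} and {3, 4}.
A = [[Fraction(-1), Fraction(2), Fraction(0), Fraction(0)],
     [Fraction(1), Fraction(-2), Fraction(0), Fraction(0)],
     [Fraction(0), Fraction(0), Fraction(-3), Fraction(1)],
     [Fraction(0), Fraction(0), Fraction(3), Fraction(-1)]]
q = [Fraction(1, 4), Fraction(3, 4)]   # component masses, summing to one

L, f, G = build_reduction([2, 2], q)
LG = matmul(L, G)                      # should equal I_{M-K} (Eq. A.2)
A_red = matmul(matmul(L, A), G)        # reduced generator LAG
b_red = matmul(matmul(L, A), f)        # reduced offset LAf
```

In this example the reduced generator is diagonal, reflecting that the two components evolve independently.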

B Relations to previous results - stationary distribution expression

In the main text (Eq. 4.4) we derived an expression for the stationary distribution

$$p_\infty = \left( I_M - H (JAH)^{-1} J A \right) e_M . \tag{B.1}$$

Note however, that this is not the first explicit form suggested for the solution of Eq. 4.1. For example, [5] proved that

$$p_\infty = \left( A + v e^T \right)^{-1} v \tag{B.2}$$

for any $v$ such that $e^T v \neq 0$.

Both Eq. 4.4 and Eq. B.2 must be equal and behave similarly if we vary $A$. For example, Eq. B.1 immediately implies that $p_\infty$ does not change if we scale $A \to cA$ by some non-zero constant, as implied by Eq. 4.1. This can be seen also in Eq. B.2 if we scale $v \to cv$ simultaneously with the scaling in $A$.

To prove that both equations coincide (for any choice of $v$), we equate them, expecting to derive an identity:

$$\begin{aligned}
v &= \left( A + v e^T \right) \left( I_M - H (JAH)^{-1} J A \right) e_M \\
&= A e_M - A H (JAH)^{-1} J A e_M + v e^T e_M - v e^T H (JAH)^{-1} J A e_M .
\end{aligned}$$

Since $e^T e_M = 1$ and $e^T H = 0$, we obtain

$$0 = A e_M - A H (JAH)^{-1} J A e_M .$$

Multiplying this by $J$ from the left we get $0 = 0$, as expected. Multiplying by $e^T$ from the left also gives $0 = 0$, since $e^T A = 0$. Since the row vectors of $J$, combined with $e^T$, span the vector space $\mathbb{R}^M$, this concludes our proof.
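The agreement of the two expressions can also be checked numerically. The sketch below evaluates Eq. B.2 for two arbitrary choices of $v$ with $e^T v \neq 0$ and compares the result with the reduced-form expression, for a hypothetical 3-state rate matrix; the helper names are illustrative.

```python
from fractions import Fraction


def solve(mat, rhs):
    """Solve mat * x = rhs by Gauss-Jordan elimination (exact arithmetic)."""
    n = len(mat)
    aug = [row[:] + [b] for row, b in zip(mat, rhs)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [aug[r][n] for r in range(n)]


def stationary_reduced(A):
    """Reduced-form route: solve (JAH) p_red = -J A e_M, then normalize."""
    M = len(A)
    A_red = [[A[i][j] - A[i][M - 1] for j in range(M - 1)]
             for i in range(M - 1)]
    b_red = [A[i][M - 1] for i in range(M - 1)]
    p_red = solve(A_red, [-b for b in b_red])
    return p_red + [1 - sum(p_red)]


def stationary_rank_one(A, v):
    """Eq. B.2 route: solve (A + v e^T) p = v, where (v e^T)_{ij} = v_i."""
    M = len(A)
    shifted = [[A[i][j] + v[i] for j in range(M)] for i in range(M)]
    return solve(shifted, v)


# Hypothetical rate matrix with zero column sums (A[i][j]: rate from j to i).
A = [[Fraction(-3), Fraction(1), Fraction(2)],
     [Fraction(1), Fraction(-4), Fraction(2)],
     [Fraction(2), Fraction(3), Fraction(-4)]]

p_ref = stationary_reduced(A)
p_a = stationary_rank_one(A, [Fraction(1), Fraction(0), Fraction(0)])
p_b = stationary_rank_one(A, [Fraction(2), Fraction(-1), Fraction(5)])
```

Both choices of `v` reproduce `p_ref` exactly, as the proof above predicts.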